Overview

Dataset statistics

Number of variables: 18
Number of observations: 10105
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.4 MiB
Average record size in memory: 144.0 B

Variable types

Numeric: 8
Categorical: 6
Boolean: 4

Alerts

pdays is highly correlated with previous (High correlation)
previous is highly correlated with pdays (High correlation)
job is highly correlated with education (High correlation)
education is highly correlated with job (High correlation)
df_index is highly correlated with duration and 1 other field (High correlation)
age is highly correlated with job and 1 other field (High correlation)
job is highly correlated with age and 1 other field (High correlation)
marital is highly correlated with age (High correlation)
housing is highly correlated with month (High correlation)
contact is highly correlated with month (High correlation)
day is highly correlated with month (High correlation)
month is highly correlated with housing and 2 other fields (High correlation)
duration is highly correlated with df_index and 1 other field (High correlation)
pdays is highly correlated with poutcome (High correlation)
poutcome is highly correlated with pdays (High correlation)
deposit is highly correlated with df_index and 1 other field (High correlation)
df_index has unique values (Unique)
balance has 774 (7.7%) zeros (Zeros)
previous has 7568 (74.9%) zeros (Zeros)

Reproduction

Analysis started: 2022-09-28 07:31:51.328511
Analysis finished: 2022-09-28 07:32:00.382023
Duration: 9.05 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 10105
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5627.074715
Minimum: 0
Maximum: 11161
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 79.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 535.2
Q1: 2852
Median: 5684
Q3: 8413
95-th percentile: 10603.8
Maximum: 11161
Range: 11161
Interquartile range (IQR): 5561

Descriptive statistics

Standard deviation: 3223.261961
Coefficient of variation (CV): 0.5728130733
Kurtosis: -1.192566309
Mean: 5627.074715
Median Absolute Deviation (MAD): 2776
Skewness: -0.03073460102
Sum: 56861590
Variance: 10389417.67
Monotonicity: Strictly increasing
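
Several of the figures reported for df_index can be derived from one another, which gives a quick sanity check on the summary. The sketch below assumes the values are copied by hand from this section; the variable names are illustrative and do not come from the report or from pandas-profiling.

```python
import math

# Values copied from the df_index summary in this report (hand-transcribed).
n_observations = 10105
total = 56861590
mean = 5627.074715
std = 3223.261961
variance = 10389417.67
cv = 0.5728130733
q1, q3 = 2852, 8413
minimum, maximum = 0, 11161

# Quantities that should agree with the reported ones.
derived = {
    "mean": total / n_observations,   # Sum / number of observations
    "variance": std ** 2,             # squared standard deviation
    "cv": std / mean,                 # coefficient of variation
    "iqr": q3 - q1,                   # interquartile range
    "range": maximum - minimum,
}

for name, value in derived.items():
    print(f"{name}: {value}")
```

The same relations should hold for the other numeric variables, up to the rounding shown in the report.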
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
0                     1      < 0.1%
7508                  1      < 0.1%
7498                  1      < 0.1%
7499                  1      < 0.1%
7500                  1      < 0.1%
7501                  1      < 0.1%
7502                  1      < 0.1%
7504                  1      < 0.1%
7507                  1      < 0.1%
7509                  1      < 0.1%
Other values (10095)  10095  99.9%

Value  Count  Frequency (%)
0      1      < 0.1%
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
11161  1      < 0.1%
11160  1      < 0.1%
11159  1      < 0.1%
11158  1      < 0.1%
11157  1      < 0.1%
11156  1      < 0.1%
11155  1      < 0.1%
11154  1      < 0.1%
11153  1      < 0.1%
11152  1      < 0.1%

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 76
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.89549728
Minimum: 18
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 79.1 KiB

Quantile statistics

Minimum: 18
5-th percentile: 26
Q1: 32
Median: 38
Q3: 48
95-th percentile: 61
Maximum: 95
Range: 77
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 11.73493055
Coefficient of variation (CV): 0.286949208
Kurtosis: 0.6572927811
Mean: 40.89549728
Median Absolute Deviation (MAD): 8
Skewness: 0.8677036244
Sum: 413249
Variance: 137.7085951
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
31                 459    4.5%
32                 432    4.3%
34                 432    4.3%
33                 431    4.3%
35                 429    4.2%
30                 415    4.1%
36                 393    3.9%
37                 334    3.3%
38                 319    3.2%
39                 317    3.1%
Other values (66)  6144   60.8%

Value  Count  Frequency (%)
18     8      0.1%
19     13     0.1%
20     19     0.2%
21     29     0.3%
22     48     0.5%
23     64     0.6%
24     89     0.9%
25     162    1.6%
26     220    2.2%
27     221    2.2%

Value  Count  Frequency (%)
95     1      < 0.1%
93     2      < 0.1%
92     2      < 0.1%
90     2      < 0.1%
89     1      < 0.1%
88     2      < 0.1%
87     3      < 0.1%
86     4      < 0.1%
85     3      < 0.1%
84     2      < 0.1%

job
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Value             Count
management        2315
blue-collar       1807
technician        1638
admin.            1246
services          868
Other values (6)  2231

Length

Max length: 13
Median length: 12
Mean length: 9.36091044
Min length: 6

Characters and Unicode

Total characters: 94592
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: admin.
2nd row: admin.
3rd row: technician
4th row: services
5th row: admin.

Common Values

Value          Count  Frequency (%)
management     2315   22.9%
blue-collar    1807   17.9%
technician     1638   16.2%
admin.         1246   12.3%
services       868    8.6%
retired        663    6.6%
self-employed  358    3.5%
unemployed     332    3.3%
student        326    3.2%
entrepreneur   300    3.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
management2315
22.9%
blue-collar1807
17.9%
technician1638
16.2%
admin1246
12.3%
services868
 
8.6%
retired663
 
6.6%
self-employed358
 
3.5%
unemployed332
 
3.3%
student326
 
3.2%
entrepreneur300
 
3.0%

Most occurring characters

ValueCountFrequency (%)
e14653
15.5%
n10410
11.0%
a9573
10.1%
m6818
 
7.2%
l6469
 
6.8%
i6305
 
6.7%
c5951
 
6.3%
t5568
 
5.9%
r4901
 
5.2%
d3177
 
3.4%
Other values (12)20767
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter91181
96.4%
Dash Punctuation2165
 
2.3%
Other Punctuation1246
 
1.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e14653
16.1%
n10410
11.4%
a9573
10.5%
m6818
 
7.5%
l6469
 
7.1%
i6305
 
6.9%
c5951
 
6.5%
t5568
 
6.1%
r4901
 
5.4%
d3177
 
3.5%
Other values (10)17356
19.0%
Dash Punctuation
ValueCountFrequency (%)
-2165
100.0%
Other Punctuation
ValueCountFrequency (%)
.1246
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin91181
96.4%
Common3411
 
3.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e14653
16.1%
n10410
11.4%
a9573
10.5%
m6818
 
7.5%
l6469
 
7.1%
i6305
 
6.9%
c5951
 
6.5%
t5568
 
6.1%
r4901
 
5.4%
d3177
 
3.5%
Other values (10)17356
19.0%
Common
ValueCountFrequency (%)
-2165
63.5%
.1246
36.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII94592
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e14653
15.5%
n10410
11.0%
a9573
10.1%
m6818
 
7.2%
l6469
 
6.8%
i6305
 
6.7%
c5951
 
6.3%
t5568
 
5.9%
r4901
 
5.2%
d3177
 
3.4%
Other values (12)20767
22.0%

marital
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Value     Count
married   5715
single    3213
divorced  1177

Length

Max length: 8
Median length: 7
Mean length: 6.798515586
Min length: 6

Characters and Unicode

Total characters: 68699
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: married
2nd row: married
3rd row: married
4th row: married
5th row: married

Common Values

Value     Count  Frequency (%)
married   5715   56.6%
single    3213   31.8%
divorced  1177   11.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
married5715
56.6%
single3213
31.8%
divorced1177
 
11.6%

Most occurring characters

ValueCountFrequency (%)
r12607
18.4%
i10105
14.7%
e10105
14.7%
d8069
11.7%
m5715
8.3%
a5715
8.3%
s3213
 
4.7%
n3213
 
4.7%
g3213
 
4.7%
l3213
 
4.7%
Other values (3)3531
 
5.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter68699
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r12607
18.4%
i10105
14.7%
e10105
14.7%
d8069
11.7%
m5715
8.3%
a5715
8.3%
s3213
 
4.7%
n3213
 
4.7%
g3213
 
4.7%
l3213
 
4.7%
Other values (3)3531
 
5.1%

Most occurring scripts

ValueCountFrequency (%)
Latin68699
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r12607
18.4%
i10105
14.7%
e10105
14.7%
d8069
11.7%
m5715
8.3%
a5715
8.3%
s3213
 
4.7%
n3213
 
4.7%
g3213
 
4.7%
l3213
 
4.7%
Other values (3)3531
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII68699
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r12607
18.4%
i10105
14.7%
e10105
14.7%
d8069
11.7%
m5715
8.3%
a5715
8.3%
s3213
 
4.7%
n3213
 
4.7%
g3213
 
4.7%
l3213
 
4.7%
Other values (3)3531
 
5.1%

education
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Value      Count
secondary  5517
tertiary   3239
primary    1349

Length

Max length: 9
Median length: 9
Mean length: 8.412469075
Min length: 7

Characters and Unicode

Total characters: 85008
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: secondary
2nd row: secondary
3rd row: secondary
4th row: secondary
5th row: tertiary

Common Values

Value      Count  Frequency (%)
secondary  5517   54.6%
tertiary   3239   32.1%
primary    1349   13.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
secondary5517
54.6%
tertiary3239
32.1%
primary1349
 
13.3%

Most occurring characters

ValueCountFrequency (%)
r14693
17.3%
a10105
11.9%
y10105
11.9%
e8756
10.3%
t6478
7.6%
s5517
 
6.5%
c5517
 
6.5%
o5517
 
6.5%
n5517
 
6.5%
d5517
 
6.5%
Other values (3)7286
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter85008
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r14693
17.3%
a10105
11.9%
y10105
11.9%
e8756
10.3%
t6478
7.6%
s5517
 
6.5%
c5517
 
6.5%
o5517
 
6.5%
n5517
 
6.5%
d5517
 
6.5%
Other values (3)7286
8.6%

Most occurring scripts

ValueCountFrequency (%)
Latin85008
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r14693
17.3%
a10105
11.9%
y10105
11.9%
e8756
10.3%
t6478
7.6%
s5517
 
6.5%
c5517
 
6.5%
o5517
 
6.5%
n5517
 
6.5%
d5517
 
6.5%
Other values (3)7286
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII85008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r14693
17.3%
a10105
11.9%
y10105
11.9%
e8756
10.3%
t6478
7.6%
s5517
 
6.5%
c5517
 
6.5%
o5517
 
6.5%
n5517
 
6.5%
d5517
 
6.5%
Other values (3)7286
8.6%

default
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 KiB

Value  Count  Frequency (%)
False  9939   98.4%
True   166    1.6%

balance
Real number (ℝ)

ZEROS

Distinct: 2963
Distinct (%): 29.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 807.6535379
Minimum: -2049
Maximum: 4063
Zeros: 774
Zeros (%): 7.7%
Negative: 682
Negative (%): 6.7%
Memory size: 79.1 KiB

Quantile statistics

Minimum: -2049
5-th percentile: -87
Q1: 95
Median: 445
Q3: 1227
95-th percentile: 3040.4
Maximum: 4063
Range: 6112
Interquartile range (IQR): 1132

Descriptive statistics

Standard deviation: 994.1519657
Coefficient of variation (CV): 1.230913899
Kurtosis: 1.112528777
Mean: 807.6535379
Median Absolute Deviation (MAD): 433
Skewness: 1.309601354
Sum: 8161339
Variance: 988338.1308
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    774    7.7%
1                    39     0.4%
2                    34     0.3%
3                    34     0.3%
550                  31     0.3%
4                    29     0.3%
5                    27     0.3%
19                   20     0.2%
8                    19     0.2%
62                   18     0.2%
Other values (2953)  9080   89.9%

Value  Count  Frequency (%)
-2049  1      < 0.1%
-1965  1      < 0.1%
-1944  1      < 0.1%
-1701  1      < 0.1%
-1636  1      < 0.1%
-1531  1      < 0.1%
-1489  1      < 0.1%
-1451  1      < 0.1%
-1415  2      < 0.1%
-1386  1      < 0.1%

Value  Count  Frequency (%)
4063   1      < 0.1%
4062   1      < 0.1%
4060   1      < 0.1%
4056   1      < 0.1%
4054   1      < 0.1%
4053   1      < 0.1%
4048   1      < 0.1%
4047   1      < 0.1%
4041   2      < 0.1%
4040   1      < 0.1%

housing
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 KiB

Value  Count  Frequency (%)
False  5243   51.9%
True   4862   48.1%

loan
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 KiB

Value  Count  Frequency (%)
False  8712   86.2%
True   1393   13.8%

contact
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Value      Count
cellular   7283
unknown    2161
telephone  661

Length

Max length: 9
Median length: 8
Mean length: 7.851558634
Min length: 7

Characters and Unicode

Total characters: 79340
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: unknown
2nd row: unknown
3rd row: unknown
4th row: unknown
5th row: unknown

Common Values

Value      Count  Frequency (%)
cellular   7283   72.1%
unknown    2161   21.4%
telephone  661    6.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
cellular7283
72.1%
unknown2161
 
21.4%
telephone661
 
6.5%

Most occurring characters

ValueCountFrequency (%)
l22510
28.4%
u9444
11.9%
e9266
11.7%
c7283
 
9.2%
a7283
 
9.2%
r7283
 
9.2%
n7144
 
9.0%
o2822
 
3.6%
k2161
 
2.7%
w2161
 
2.7%
Other values (3)1983
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter79340
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l22510
28.4%
u9444
11.9%
e9266
11.7%
c7283
 
9.2%
a7283
 
9.2%
r7283
 
9.2%
n7144
 
9.0%
o2822
 
3.6%
k2161
 
2.7%
w2161
 
2.7%
Other values (3)1983
 
2.5%

Most occurring scripts

ValueCountFrequency (%)
Latin79340
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l22510
28.4%
u9444
11.9%
e9266
11.7%
c7283
 
9.2%
a7283
 
9.2%
r7283
 
9.2%
n7144
 
9.0%
o2822
 
3.6%
k2161
 
2.7%
w2161
 
2.7%
Other values (3)1983
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII79340
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l22510
28.4%
u9444
11.9%
e9266
11.7%
c7283
 
9.2%
a7283
 
9.2%
r7283
 
9.2%
n7144
 
9.0%
o2822
 
3.6%
k2161
 
2.7%
w2161
 
2.7%
Other values (3)1983
 
2.5%

day
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.59030183
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 79.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 8
Median: 15
Q3: 22
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.441509961
Coefficient of variation (CV): 0.5414590463
Kurtosis: -1.068760047
Mean: 15.59030183
Median Absolute Deviation (MAD): 7
Skewness: 0.1323348611
Sum: 157540
Variance: 71.25909042
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Value              Count  Frequency (%)
18                 493    4.9%
20                 492    4.9%
5                  449    4.4%
30                 430    4.3%
15                 426    4.2%
13                 419    4.1%
6                  413    4.1%
14                 408    4.0%
12                 403    4.0%
8                  394    3.9%
Other values (21)  5778   57.2%

Value  Count  Frequency (%)
1      109    1.1%
2      290    2.9%
3      286    2.8%
4      364    3.6%
5      449    4.4%
6      413    4.1%
7      359    3.6%
8      394    3.9%
9      334    3.3%
10     148    1.5%

Value  Count  Frequency (%)
31     126    1.2%
30     430    4.3%
29     358    3.5%
28     377    3.7%
27     262    2.6%
26     233    2.3%
25     200    2.0%
24     112    1.1%
23     210    2.1%
22     236    2.3%

month
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Value             Count
may               2617
jul               1418
aug               1385
jun               1104
apr               830
Other values (7)  2751

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 30315
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: may
2nd row: may
3rd row: may
4th row: may
5th row: may

Common Values

Value             Count  Frequency (%)
may               2617   25.9%
jul               1418   14.0%
aug               1385   13.7%
jun               1104   10.9%
apr               830    8.2%
nov               780    7.7%
feb               709    7.0%
oct               335    3.3%
jan               319    3.2%
sep               278    2.8%
Other values (2)  330    3.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
may2617
25.9%
jul1418
14.0%
aug1385
13.7%
jun1104
10.9%
apr830
 
8.2%
nov780
 
7.7%
feb709
 
7.0%
oct335
 
3.3%
jan319
 
3.2%
sep278
 
2.8%
Other values (2)330
 
3.3%

Most occurring characters

ValueCountFrequency (%)
a5388
17.8%
u3907
12.9%
m2854
9.4%
j2841
9.4%
y2617
8.6%
n2203
7.3%
l1418
 
4.7%
g1385
 
4.6%
o1115
 
3.7%
p1108
 
3.7%
Other values (9)5479
18.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter30315
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a5388
17.8%
u3907
12.9%
m2854
9.4%
j2841
9.4%
y2617
8.6%
n2203
7.3%
l1418
 
4.7%
g1385
 
4.6%
o1115
 
3.7%
p1108
 
3.7%
Other values (9)5479
18.1%

Most occurring scripts

ValueCountFrequency (%)
Latin30315
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a5388
17.8%
u3907
12.9%
m2854
9.4%
j2841
9.4%
y2617
8.6%
n2203
7.3%
l1418
 
4.7%
g1385
 
4.6%
o1115
 
3.7%
p1108
 
3.7%
Other values (9)5479
18.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII30315
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a5388
17.8%
u3907
12.9%
m2854
9.4%
j2841
9.4%
y2617
8.6%
n2203
7.3%
l1418
 
4.7%
g1385
 
4.6%
o1115
 
3.7%
p1108
 
3.7%
Other values (9)5479
18.1%

duration
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1390
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 368.7426027
Minimum: 2
Maximum: 3881
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 79.1 KiB

Quantile statistics

Minimum: 2
5-th percentile: 50
Q1: 137
Median: 252
Q3: 490
95-th percentile: 1075.8
Maximum: 3881
Range: 3879
Interquartile range (IQR): 353

Descriptive statistics

Standard deviation: 346.6515237
Coefficient of variation (CV): 0.9400907873
Kurtosis: 7.797566486
Mean: 368.7426027
Median Absolute Deviation (MAD): 142
Skewness: 2.19978852
Sum: 3726144
Variance: 120167.2789
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
161                  36     0.4%
97                   35     0.3%
136                  34     0.3%
144                  33     0.3%
114                  33     0.3%
150                  33     0.3%
87                   32     0.3%
90                   32     0.3%
158                  32     0.3%
152                  32     0.3%
Other values (1380)  9773   96.7%

Value  Count  Frequency (%)
2      1      < 0.1%
3      1      < 0.1%
4      2      < 0.1%
5      2      < 0.1%
6      6      0.1%
7      15     0.1%
8      15     0.1%
9      10     0.1%
10     15     0.1%
11     10     0.1%

Value  Count  Frequency (%)
3881   1      < 0.1%
3284   1      < 0.1%
3253   1      < 0.1%
3183   1      < 0.1%
3102   1      < 0.1%
3094   1      < 0.1%
3076   1      < 0.1%
2775   1      < 0.1%
2770   1      < 0.1%
2769   1      < 0.1%

campaign
Real number (ℝ≥0)

Distinct: 35
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.517169718
Minimum: 1
Maximum: 43
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 79.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 7
Maximum: 43
Range: 42
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.707158757
Coefficient of variation (CV): 1.075477246
Kurtosis: 39.56382556
Mean: 2.517169718
Median Absolute Deviation (MAD): 1
Skewness: 4.937566318
Sum: 25436
Variance: 7.328708537
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)
Value              Count  Frequency (%)
1                  4331   42.9%
2                  2749   27.2%
3                  1192   11.8%
4                  696    6.9%
5                  347    3.4%
6                  239    2.4%
7                  126    1.2%
8                  117    1.2%
9                  64     0.6%
10                 47     0.5%
Other values (25)  197    1.9%

Value  Count  Frequency (%)
1      4331   42.9%
2      2749   27.2%
3      1192   11.8%
4      696    6.9%
5      347    3.4%
6      239    2.4%
7      126    1.2%
8      117    1.2%
9      64     0.6%
10     47     0.5%

Value  Count  Frequency (%)
43     2      < 0.1%
41     1      < 0.1%
33     1      < 0.1%
32     2      < 0.1%
31     1      < 0.1%
30     4      < 0.1%
29     2      < 0.1%
28     1      < 0.1%
27     1      < 0.1%
26     3      < 0.1%

pdays
Real number (ℝ)

HIGH CORRELATION

Distinct: 458
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.31964374
Minimum: -1
Maximum: 854
Zeros: 0
Zeros (%): 0.0%
Negative: 7568
Negative (%): 74.9%
Memory size: 79.1 KiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: -1
Q3: 2
95-th percentile: 329
Maximum: 854
Range: 855
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 109.6441789
Coefficient of variation (CV): 2.136495324
Kurtosis: 6.885033932
Mean: 51.31964374
Median Absolute Deviation (MAD): 0
Skewness: 2.463076113
Sum: 518585
Variance: 12021.84596
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
-1                  7568   74.9%
92                  88     0.9%
182                 77     0.8%
181                 75     0.7%
91                  74     0.7%
183                 68     0.7%
184                 40     0.4%
94                  39     0.4%
93                  38     0.4%
95                  33     0.3%
Other values (448)  2005   19.8%

Value  Count  Frequency (%)
-1     7568   74.9%
1      8      0.1%
2      8      0.1%
4      1      < 0.1%
5      2      < 0.1%
6      3      < 0.1%
8      2      < 0.1%
9      7      0.1%
10     3      < 0.1%
12     1      < 0.1%

Value  Count  Frequency (%)
854    1      < 0.1%
842    1      < 0.1%
828    1      < 0.1%
826    1      < 0.1%
805    1      < 0.1%
804    1      < 0.1%
792    1      < 0.1%
784    1      < 0.1%
782    1      < 0.1%
778    1      < 0.1%

previous
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8162295893
Minimum: 0
Maximum: 58
Zeros: 7568
Zeros (%): 74.9%
Negative: 0
Negative (%): 0.0%
Memory size: 79.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 5
Maximum: 58
Range: 58
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.243794521
Coefficient of variation (CV): 2.748974737
Kurtosis: 113.4393749
Mean: 0.8162295893
Median Absolute Deviation (MAD): 0
Skewness: 7.440263081
Sum: 8248
Variance: 5.034613851
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value              Count  Frequency (%)
0                  7568   74.9%
1                  796    7.9%
2                  612    6.1%
3                  391    3.9%
4                  223    2.2%
5                  147    1.5%
6                  107    1.1%
7                  66     0.7%
8                  58     0.6%
9                  28     0.3%
Other values (20)  109    1.1%

Value  Count  Frequency (%)
0      7568   74.9%
1      796    7.9%
2      612    6.1%
3      391    3.9%
4      223    2.2%
5      147    1.5%
6      107    1.1%
7      66     0.7%
8      58     0.6%
9      28     0.3%

Value  Count  Frequency (%)
58     1      < 0.1%
55     1      < 0.1%
41     1      < 0.1%
37     1      < 0.1%
30     1      < 0.1%
29     1      < 0.1%
27     2      < 0.1%
23     2      < 0.1%
22     1      < 0.1%
20     2      < 0.1%

poutcome
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 79.1 KiB

Value    Count
unknown  7570
failure  1109
success  945
other    481

Length

Max length: 7
Median length: 7
Mean length: 6.904799604
Min length: 5

Characters and Unicode

Total characters: 69773
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: unknown
2nd row: unknown
3rd row: unknown
4th row: unknown
5th row: unknown

Common Values

Value    Count  Frequency (%)
unknown  7570   74.9%
failure  1109   11.0%
success  945    9.4%
other    481    4.8%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
unknown7570
74.9%
failure1109
 
11.0%
success945
 
9.4%
other481
 
4.8%

Most occurring characters

ValueCountFrequency (%)
n22710
32.5%
u9624
13.8%
o8051
 
11.5%
k7570
 
10.8%
w7570
 
10.8%
s2835
 
4.1%
e2535
 
3.6%
c1890
 
2.7%
r1590
 
2.3%
f1109
 
1.6%
Other values (5)4289
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter69773
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n22710
32.5%
u9624
13.8%
o8051
 
11.5%
k7570
 
10.8%
w7570
 
10.8%
s2835
 
4.1%
e2535
 
3.6%
c1890
 
2.7%
r1590
 
2.3%
f1109
 
1.6%
Other values (5)4289
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Latin69773
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n22710
32.5%
u9624
13.8%
o8051
 
11.5%
k7570
 
10.8%
w7570
 
10.8%
s2835
 
4.1%
e2535
 
3.6%
c1890
 
2.7%
r1590
 
2.3%
f1109
 
1.6%
Other values (5)4289
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII69773
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n22710
32.5%
u9624
13.8%
o8051
 
11.5%
k7570
 
10.8%
w7570
 
10.8%
s2835
 
4.1%
e2535
 
3.6%
c1890
 
2.7%
r1590
 
2.3%
f1109
 
1.6%
Other values (5)4289
 
6.1%

deposit
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.0 KiB

Value  Count  Frequency (%)
False  5424   53.7%
True   4681   46.3%

Interactions


Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
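
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the helper names (average_ranks, pearson, spearman_rho) and the toy data are assumptions made for the example, and tied values are given the mean of their rank positions.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: y grows monotonically but not linearly with x,
# so rho is 1 up to floating-point rounding.
rho = spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])
```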

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
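
As a rough sketch of this formula and of the invariance property, the following plain-Python example computes r directly. The function name pearson_r and the toy data are assumptions made for illustration; the factor 1/n cancels between covariance and standard deviations, so raw sums are used.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy data, not taken from the profiled dataset.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)

# Shifting and (positively) rescaling x leaves r unchanged.
x_shifted_scaled = [10 * v + 3 for v in x]
r_after = pearson_r(x_shifted_scaled, y)
```

Here r and r_after agree up to floating-point rounding, which is the location and scale invariance in practice.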

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
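
The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched as follows. The function name kendall_tau_a and the toy data are assumptions for illustration; library implementations often default to the tie-corrected tau-b variant instead (SciPy's kendalltau does), so values may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The product is positive when both variables move in the same
        # direction between observations i and j, negative when they disagree.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie: counted in neither group.
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: 8 concordant and 2 discordant pairs out of 10.
tau = kendall_tau_a([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```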

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
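
A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013), is shown below. The function name cramers_v_corrected and the toy data are assumptions for illustration, and the sketch uses the plain chi-squared statistic without a continuity correction; the report's own implementation may differ in such details.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)

    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)

    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical toy data: the second variable is fully determined by the first.
housing = ["yes"] * 50 + ["no"] * 50
group = ["A"] * 50 + ["B"] * 50
v_perfect = cramers_v_corrected(housing, group)

# Hypothetical toy data: the two variables are independent.
v_independent = cramers_v_corrected(["x", "x", "y", "y"], ["p", "q", "p", "q"])
```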

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

   df_index  age  job          marital   education  default  balance  housing  loan  contact    day  month  duration  campaign  pdays  previous  poutcome  deposit
0  0         59   admin.       married   secondary  no       2343.0   yes      no    unknown    5    may    1042      1         -1     0         unknown   yes
1  1         56   admin.       married   secondary  no       45.0     no       no    unknown    5    may    1467      1         -1     0         unknown   yes
2  2         41   technician   married   secondary  no       1270.0   yes      no    unknown    5    may    1389      1         -1     0         unknown   yes
3  3         55   services     married   secondary  no       2476.0   yes      no    unknown    5    may    579       1         -1     0         unknown   yes
4  4         54   admin.       married   tertiary   no       184.0    no       no    unknown    5    may    673       2         -1     0         unknown   yes
5  5         42   management   single    tertiary   no       0.0      yes      yes   unknown    5    may    562       2         -1     0         unknown   yes
6  6         56   management   married   tertiary   no       830.0    yes      yes   unknown    6    may    1201      1         -1     0         unknown   yes
7  7         60   retired      divorced  secondary  no       545.0    yes      no    unknown    6    may    1030      1         -1     0         unknown   yes
8  8         37   technician   married   secondary  no       1.0      yes      no    unknown    6    may    608       1         -1     0         unknown   yes
9  9         28   services     single    secondary  no       550.0    yes      no    unknown    6    may    1297      3         -1     0         unknown   yes

Last rows

       df_index  age  job          marital   education  default  balance  housing  loan  contact    day  month  duration  campaign  pdays  previous  poutcome  deposit
10095  11152     34   housemaid    married   secondary  no       390.0    yes      no    cellular   15   jul    659       3         -1     0         unknown   no
10096  11153     43   admin.       single    secondary  no       35.0     no       no    telephone  9    nov    208       1         -1     0         unknown   no
10097  11154     52   technician   married   tertiary   no       523.0    yes      yes   cellular   8    jul    113       1         -1     0         unknown   no
10098  11155     35   blue-collar  married   secondary  no       80.0     yes      yes   cellular   21   nov    38        2         172    2         failure   no
10099  11156     34   blue-collar  single    secondary  no       -72.0    yes      no    cellular   7    jul    273       5         -1     0         unknown   no
10100  11157     33   blue-collar  single    primary    no       1.0      yes      no    cellular   20   apr    257       1         -1     0         unknown   no
10101  11158     39   services     married   secondary  no       733.0    no       no    unknown    16   jun    83        4         -1     0         unknown   no
10102  11159     32   technician   single    secondary  no       29.0     no       no    cellular   19   aug    156       2         -1     0         unknown   no
10103  11160     43   technician   married   secondary  no       0.0      no       yes   cellular   8    may    9         2         172    5         failure   no
10104  11161     34   technician   married   secondary  no       0.0      no       no    cellular   9    jul    628       1         -1     0         unknown   no